Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JIT] X64 - Centralize peephole optimization for removing redundant mov instructions #85780

Merged
merged 10 commits into from
May 11, 2023

Conversation

TIHan
Copy link
Contributor

@TIHan TIHan commented May 4, 2023

Description

This centralizes checking for redundant mov instructions.

Depending on what we do for #85734 , we need to counter some of the regressions that may occur as a result of not removing CAST nodes.

This also includes more peephole optimizations for movzx, movsx and movsxd.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 4, 2023
@ghost ghost assigned TIHan May 4, 2023
@ghost
Copy link

ghost commented May 4, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

Depending on what we do for #85734 , we need to counter some of the regressions that may occur as a result of not removing CAST nodes. For x64, we can do this by looking for redundant movzx and movsx instructions.

Author: TIHan
Assignees: TIHan
Labels:

area-CodeGen-coreclr

Milestone: -

@TIHan TIHan changed the title Added AreUpperBitsZero and AreUpperBitsSignExtended [JIT] X64 - Peephole optimization for removing redundant movzx and movsx instructions May 4, 2023
@TIHan TIHan changed the title [JIT] X64 - Peephole optimization for removing redundant movzx and movsx instructions [JIT] X64 - Centralization peephole optimization for removing redundant mov instructions May 4, 2023
@TIHan TIHan changed the title [JIT] X64 - Centralization peephole optimization for removing redundant mov instructions [JIT] X64 - Centralize peephole optimization for removing redundant mov instructions May 4, 2023
@TIHan
Copy link
Contributor Author

TIHan commented May 5, 2023

/azp run runtime-coreclr gcstress0x3-gcstress0xc

@TIHan
Copy link
Contributor Author

TIHan commented May 5, 2023

/azp run Fuzzlyn

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment
@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@TIHan TIHan marked this pull request as ready for review May 5, 2023 01:12
@TIHan
Copy link
Contributor Author

TIHan commented May 5, 2023

/azp run Fuzzlyn

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@TIHan
Copy link
Contributor Author

TIHan commented May 5, 2023

@dotnet/jit-contrib this is ready. PTAL @BruceForstall - Diffs

All diff improvements, and we get a TP win for x64.
There is a significant TP regression for x86 which is likely because we never actually did any of these optimizations for x86 - we are now eliminating redundant movsx and movzx instructions.

@TIHan
Copy link
Contributor Author

TIHan commented May 5, 2023

Thinking about it, I'm not sure why there are TP regressions for x86 for MinOpts. These optimizations only kick in when optimizations are on.

src/coreclr/jit/emitxarch.cpp Show resolved Hide resolved
src/coreclr/jit/emitxarch.cpp Show resolved Hide resolved
{
result = (id->idOpSize() <= size);
}
#ifdef TARGET_64BIT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On 32-bit will EA_4BYTE ever be passed? If not, we don't technically don't need the ifdef?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size can be EA_4BYTE on 32-bit - ifdef'ing this only for 64-bit just does less work on 32-bit.

result = (id->idOpSize() <= size);
}
#ifdef TARGET_64BIT
// movsx/movsxd always sign extends to 8 bytes. W-bit is set.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the W-bit matter here?

Copy link
Contributor Author

@TIHan TIHan May 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W-bit on the encoding is what makes movsx and movsxd sign-extend to 8 bytes. We always default to using the W-bit. Technically, we could have a movsx or movsxd instruction not set the W-bit which would not sign-extend to 8 bytes (that would be a problem and make this optimization not safe), but we never actually emit those instructions that way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function cares about instructions and sizes, but not encoding, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment W-bit is set itself is just meant to inform the reason why this optimization is safe to do. It was already there before.

@ghost ghost added needs-author-action An issue or pull request that requires more info or actions from the author. and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels May 6, 2023
@TIHan
Copy link
Contributor Author

TIHan commented May 8, 2023

@BruceForstall This is ready again.

@TIHan
Copy link
Contributor Author

TIHan commented May 10, 2023

@BruceForstall This is ready again. The optimizations only happen for X64, which was the case before, so no TP regressions for X86.

@BruceForstall
Copy link
Member

Diffs

@TIHan TIHan merged commit e87dafa into dotnet:main May 11, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jun 10, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants